Authors: Andreas Novotny (1,2), Caterina Rodrigues (1,2), Loïc Jacquemot (1,2), Rute B. G. Clemente-Carvalho (2), Rebecca S. Piercey (2), Evan Morien (2), Moira Galbraith (3), Colleen T. E. Kellogg (2), Matthew A. Lemay (2), and Brian P. V. Hunt (1,4)

  1. Institute for the Oceans and Fisheries, University of British Columbia, 2202 Main Mall, Vancouver, BC, V6T 1Z4 Canada

  2. Hakai Institute, PO Box 25039, Campbell River, BC, V9W 0B7, Canada

  3. Institute of Ocean Sciences, Fisheries and Oceans Canada, PO Box 6000, Sidney, BC, V8L 4B2, Canada

  4. Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, 2202 Main Mall, Vancouver, BC, V6T 1Z4 Canada

Abstract

In this study, we examined how well DNA metabarcoding capture changes in biomass of marine mesozooplankton. We evaluated the effectiveness of three different standardisation methods – relative abundance, presence-absence, and the eDNA-index – for correlating seasonal trends between DNA sequence read abundance and microscopic biomass estimates of zooplankton net samples collected every two weeks in a 1-year time series. To assess if taxa specific trends are sensitive to the selection of genetic marker, we sequenced both short fragment of the COI gene, with primers optimised to capture marine invertebrate community (Leray et al., 2013) and the 18S gene, with universal primers optimised for marine microbial eukaryotes (Balzano et al., 2015). In doing so, we provide both new insights into the reliability of DNA metabarcoding analysis and guidelines for future studies that aim to use environmental DNA to study marine community dynamics. Further, through analysis of discreet water column DNA samples, we demonstrate the potential of the DNA approach in resolving differences in vertical zooplankton distributions.

0. Preparations

This R notebook was used to produce the graphical and statistical output for the manuscript and its supplementary figures and tables. See manuscript for details on sampling, experimental design and sample processing.

Prior ro analysis in this script 18S and COI amplicons have been sequenced on a MiSeq and raw Fastq sequences have been uploaded to NCBI under the BioProject accession number PRJNA1141475 (https://www.ncbi.nlm.nih.gov/sra/PRJNA1141475). All sample-specific accession numbers are listed in the metadata files. See further down.

DNA sequences (FASTQ files) were processed using a custom bioinformatic pipeline (see Supplementary Data 2: https://doi.org/10.5281/zenodo.14241795) using DADA2, with sequences annotated to the MetaZooGene database. ASV tables with taxonomic annotations are found in the ./Data folder.

These R packages are required for executing the code below:

# Data import
library(readxl)

# All data format modifications and filtering and plotting
library(tidyverse)
'%!in%' <- function(x,y)!('%in%'(x,y))

# This script contains a function for calculating eDNA-index.
source("Code/Functions.R")

# Specific plotting tools
library(UpSetR) # For making Euler diagrams
library(eulerr) # For making Euler diagrams
library(ggOceanMaps) # For maps
library(ggspatial)  # For map modifications
library(cowplot) # Simple plot themes
library(ggpubr) # Plot layout
library(patchwork) # Plot layout
library(ggnewscale) # Multiple scales on one plot

# Statistical analyses
library(vegan)

# Import all datasets from Bioinformatic Pipeline
#system2("cp", args = c(
#  "../CoastalBCdata/Data_output/To_ZoopMethodsComp/*",
#  "Data/"
#))

Sampling Station

Samples were collected during 2017 from station QU39 in the Salish Sea, British Columbia, Canada.

# 1. Draw inner map with points for stations

# Make a dataframe cointatining staion coordinates
pin <- data.frame(lon = c(-125.0992), 
                 lat = c(50.0307), 
                 label = c("QU39"))

# Plot inner map
inner <-
  # Define the limits (zoom of the map)
  basemap(limits = c(-126.5, -121.8, 48.3, 51.3), rotate = TRUE) +
  # Add red point for sampling station(s)
  geom_spatial_point(data = pin, aes(x = lon, y = lat), color = "red", size = 2) +
  # Add station labels, but adjuest position to not cover the point.
  geom_spatial_label(data = pin, aes(x = lon-0.35, y = lat-0.15, label = label)) +
  # Remove axis title
  theme(axis.title.x=element_blank(),
        axis.title.y=element_blank())


# 2. Draw outer map with box highlighting the inner map.

# Make a dataframe cointatining 4 coordinates defining box corners.
# Use same coordingates as the limits defined above.
insert_box <- data.frame(lon = c(-126.5, -126.5, -121.8, -121.8),
                         lat = c(48.3, 51.3, 51.3, 48.3))
# Plot outer map
outer <-
  # Define the limits (zoom of the map)
  basemap(limits = c(-130.0, -80.0, 35.0, 75.0), rotate = TRUE) +
  # Draw polygon for inner map
  geom_spatial_polygon(data = insert_box, aes(x = lon, y = lat),
                       fill = NA, color = "red") +
  # Remove axis title
  theme(axis.title.x=element_blank(),
        axis.title.y=element_blank())
## `geom_polypath()` is deprecated: use `ggplot2::geom_polygon()` with the `subgroup` aesthetic
# 3. Make outer plot as insert of inner. Define size and location.
final_map <- inner + inset_element(outer, 0.5, 0.5, 1, 1)
ggsave("Figure_output/Fig.1B_StationMap.pdf", final_map, width = 4, height = 4)
## Assuming `crs = 4326` in stat_spatial_identity()
## Assuming `crs = 4326` in stat_spatial_identity()
## Assuming `crs = 4326` in stat_spatial_identity()
final_map
## Assuming `crs = 4326` in stat_spatial_identity()
## Assuming `crs = 4326` in stat_spatial_identity()
## Assuming `crs = 4326` in stat_spatial_identity()

rm(inner, outer, insert_box, pin)

Load all data

Reading in all COI ASVs with taxonomic annotation and sample metadata.

# Sample metadata
metadata_COI <- read_csv("Data/Sequence_Metadata_COI_QU39_2017.csv")
## Rows: 257 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (14): library_ID, lat_lon, source_material_id, title, sra_accession, bi...
## dbl   (2): depth, samp_store_temp
## date  (1): collection_date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# ASV and taxonomy table
ASV_COI <- read_csv("Data/ASV_COI_MZGdb_QU39_2017.csv")
## Rows: 213 Columns: 270
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (12): Kingdom, Phylum, Subphylum, Superclass, Subsuperclass, Class, Inf...
## dbl (258): Annotation_level, QMIC1955_COI_QU39-2017, QMIC1956_COI_QU39-2017,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Turn ASV table into "long" and merge with metadata.
All_COI_data <- ASV_COI %>%
  pivot_longer(cols = !1:13, names_to = "library_ID", values_to = "Abundance") %>% 
  full_join(metadata_COI)
## Joining with `by = join_by(library_ID)`
rm(metadata_COI, ASV_COI)

Reading in all 18S ASVs with taxonomic annotation and sample metadata.

# Sample metadata
metadata_18S <- read_csv("Data/Sequence_Metadata_18S_QU39_2017.csv")
## Rows: 192 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (14): library_ID, lat_lon, source_material_id, title, sra_accession, bi...
## dbl   (2): depth, samp_store_temp
## date  (1): collection_date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# ASV and taxonomy table
ASV_18S <- read_csv("Data/ASV_18S_MZGdb_QU39_2017.csv")
## Rows: 450 Columns: 205
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (12): Kingdom, Phylum, Subphylum, Superclass, Subsuperclass, Class, Inf...
## dbl (193): Annotation_level, QMIC1955_18S_QU39_3, QMIC1956_18S_QU39_3, QMIC1...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Turn ASV table into "long" and merge with metadata.
All_18S_data <- ASV_18S %>%
  pivot_longer(cols = !1:13, names_to = "library_ID", values_to = "Abundance") %>% 
  full_join(metadata_18S)
## Joining with `by = join_by(library_ID)`
rm(metadata_18S, ASV_18S)

For DNA, three different sample types were collected: Filtered Water, Bulk Zooplankton nets, and filtered ethanol preservative from the zooplankton nets. Here we construct a new Type variable that is based on the “collection_method” from the sample metadata.

# 1. Get levels for collection method:
#All_COI_data %>% 
#  pull(collection_method) %>% unique()

#. 2. Rename the three levels with more useable names:
Filtered_Water <- "Sterivex Seawater filtration: https://dx.doi.org/10.17504/protocols.io.rm7vzjxrrlx1/v1"
Bulk_Zooplankton <- "Vertical zooplankton net tow, preserved in 99 percent ethanol"
Zooplankton_Preservative <- "Vertical zooplankton net tow, preserved in 99 percent ethanol. Etnahol subsampling: https://dx.doi.org/10.17504/protocols.io.j8nlk888wl5r/v1"

# Add new Type variable to COI metadata. 
All_COI_data <- All_COI_data %>% 
  mutate(Type = ifelse(
    collection_method == Filtered_Water,
    "Filtered_Water",
    ifelse(
      collection_method == Bulk_Zooplankton,
      "Bulk_Zooplankton",
      "Zooplankton_Preservative")
  ))

# Add new Type variable to 18S metadata.
All_18S_data <- All_18S_data %>% 
  mutate(Type = ifelse(
    collection_method == Filtered_Water,
    "Filtered_Water",
    ifelse(
      collection_method == Bulk_Zooplankton,
      "Bulk_Zooplankton",
      "Zooplankton_Preservative")
  ))

Reading in data from taxonomic zooplankton analysis.

df_count_2017 <- read_csv("Data/Zooplankton_QU39_2017.csv")
## Rows: 11904 Columns: 32
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (18): Key, region_name, Station, Twilight, Net_Type, NOTES, Sample_ID, ...
## dbl  (11): lon, lat, Mesh_Size(um), Net_Mouth_Dia(m), DEPTH_STRT, DEPTH_END,...
## lgl   (1): CTD
## date  (1): Date
## time  (1): STN_TIME
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

1. Sample Quality Control

Sequencing Depth

Libraries with poor sequencing depth are removed. Our threshold is set to 10,000 reads per sequencing library. Further, in the MetaZooGene database, the genus Corycaeus was sometimes, but not always, labeled as Ditrichocorycaeus. Ditrichocorycaeus is a potential subgenus, but is not yet commonly accepted. We correct that here.

For COI:

# 1. Filter out libraries under threshold line of 10,000 reads per library
Good_COI_data <- All_COI_data %>% 
  group_by(library_ID) %>%
  filter(sum(Abundance)>10000) %>% 
  ungroup()

# 2.Summarize statistics of sequencing depth.  
print("Median COI library size:")
## [1] "Median COI library size:"
Good_COI_data %>%
  group_by(library_ID) %>% 
  summarise(SeqDepth = sum(Abundance)) %>%
  pull(SeqDepth) %>% median
## [1] 39642
print("Number of final COI libraries:")
## [1] "Number of final COI libraries:"
Good_COI_data %>% 
  pull(library_ID) %>% unique %>% length
## [1] 257
#### NOTE: Taxonomic correction of the subgenus Ditrichocorycaeus, not yet accepted.
Good_COI_data <- Good_COI_data %>%
  mutate(Genus = ifelse(Genus == "Ditrichocorycaeus", "Corycaeus", Genus),
         Species = ifelse(Species == "Ditrichocorycaeus anglicus","Corycaeus anglicus", Species))

Same for 18S:

# Filter out libraries with low read counts.
Good_18S_data <- All_18S_data %>% 
  group_by(library_ID) %>% 
  filter(sum(Abundance)>10000) %>% 
  ungroup()


print("Median 18S library size:")
## [1] "Median 18S library size:"
Good_18S_data %>% 
  group_by(library_ID) %>%
  summarise(SeqDepth = sum(Abundance)) %>%
  pull(SeqDepth) %>% median
## [1] 54078.5
print("Number of final 18S libraries:")
## [1] "Number of final 18S libraries:"
Good_18S_data %>% 
  pull(library_ID) %>% unique %>%  length
## [1] 188
#### NOTE: Taxonomic correction of the subgenus Ditrichocorycaeus, not yet accepted.
Good_18S_data <- Good_18S_data %>%
  mutate(Genus = ifelse(Genus == "Ditrichocorycaeus", "Corycaeus", Genus),
         Species = ifelse(Species == "Ditrichocorycaeus anglicus","Corycaeus anglicus", Species))

Summarize tables of sample frequency and sequencing depth. Creating table for zooplankton samples:

# 18S Zooplankton bulk data:
tmp_18S <- Good_18S_data %>%
  group_by(library_ID, source_material_id, collection_date, Type, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  filter(Type == "Bulk_Zooplankton") %>% 
  ungroup() %>% 
  select(collection_date, Abundance) %>% 
  mutate(Marker = "18S_Sequences")
## `summarise()` has grouped output by 'library_ID', 'source_material_id',
## 'collection_date', 'Type'. You can override using the `.groups` argument.
# COI Zooplankton bulk data:
tmp_COI <- Good_COI_data %>%
  group_by(library_ID, source_material_id, collection_date, Type, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  filter(Type == "Bulk_Zooplankton") %>% 
  ungroup() %>% 
  select(collection_date, Abundance) %>% 
  mutate(Marker = "COI_Sequences")
## `summarise()` has grouped output by 'library_ID', 'source_material_id',
## 'collection_date', 'Type'. You can override using the `.groups` argument.
# COI Zooplankton preservative
tmp_COI_EtOH <- Good_COI_data %>%
  group_by(library_ID, source_material_id, collection_date, Type, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  filter(Type == "Zooplankton_Preservative") %>% 
  ungroup() %>% 
  select(collection_date, Abundance) %>% 
  mutate(Marker = "COI_Sequences_(Sup)")
## `summarise()` has grouped output by 'library_ID', 'source_material_id',
## 'collection_date', 'Type'. You can override using the `.groups` argument.
# Microscopy Zooplankton data
tmp_mic <- df_count_2017 %>%
  group_by(Sample_ID, Station, Date, STN_TIME) %>% 
  summarise(Biomass = sum(Biomass, na.rm = TRUE)) %>%
  ungroup() %>% 
  select(collection_date = Date, Abundance = Biomass) %>% 
  mutate(Marker = "Microscopy_Biomass")
## `summarise()` has grouped output by 'Sample_ID', 'Station', 'Date'. You can
## override using the `.groups` argument.
# Combine and turn into table.
bind_rows(tmp_mic, tmp_18S, tmp_COI, tmp_COI_EtOH) %>% 
  pivot_wider(names_from = Marker, values_from = Abundance) %>% 
  arrange(collection_date) %>% 
  write_csv(file = "Data_output/Sup.Tab.S1_ZooplanktonSampleSummary.csv")
  
rm(tmp_mic, tmp_18S, tmp_COI, tmp_COI_EtOH)

Creating table for water samples:

# 18S Water sample sequencing depth
tmp_18S <- Good_18S_data %>%
  group_by(library_ID, source_material_id, collection_date, Type, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  filter(Type == "Filtered_Water") %>% 
  ungroup() %>% 
  select(collection_date, Abundance, depth) %>% 
  mutate(Marker = "18S_Sequences")
## `summarise()` has grouped output by 'library_ID', 'source_material_id',
## 'collection_date', 'Type'. You can override using the `.groups` argument.
# COI water sample sequencing depth
tmp_COI <- Good_COI_data %>%
  group_by(library_ID, source_material_id, collection_date, Type, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  filter(Type == "Filtered_Water") %>% 
  ungroup() %>% 
  select(collection_date, Abundance, depth) %>% 
  mutate(Marker = "COI_Sequences")
## `summarise()` has grouped output by 'library_ID', 'source_material_id',
## 'collection_date', 'Type'. You can override using the `.groups` argument.
# Combine to table
bind_rows(tmp_18S, tmp_COI) %>%
  group_by(collection_date, depth, Marker) %>% 
  summarise(Abundance = sum(Abundance, na.rm = TRUE)) %>% 
  pivot_wider(names_from = Marker, values_from = Abundance) %>% 
  arrange(collection_date, as.numeric(depth)) %>%
  mutate(`Sequences COI/18S` = paste(COI_Sequences, `18S_Sequences`, sep = " / ")) %>% 
  select(collection_date, depth, `Sequences COI/18S`) %>% 
  pivot_wider(names_from = depth, values_from = `Sequences COI/18S`) %>% 
  write_csv2(file = "Data_output/Suppl.Tab.S2_WaterSampleSummary.csv")
## `summarise()` has grouped output by 'collection_date', 'depth'. You can
## override using the `.groups` argument.
rm(tmp_mic, tmp_18S, tmp_COI, tmp_COI_EtOH)
## Warning in rm(tmp_mic, tmp_18S, tmp_COI, tmp_COI_EtOH): object 'tmp_mic' not
## found
## Warning in rm(tmp_mic, tmp_18S, tmp_COI, tmp_COI_EtOH): object 'tmp_COI_EtOH'
## not found

Annotation success

This first plot shows the relative abundance of Metazooans, compared to other organisms for 18S and COI respectively.

# Calculate % metazoans for all samples in 18S data
summary_18S <- Good_18S_data %>%
  mutate(depth = ifelse(Type == "Bulk_Zooplankton", "230-0", depth)) %>%
  # Summarise abundance per kingdom
  group_by(collection_date, Kingdom, library_ID, depth) %>% 
  summarise(Abundance = sum(Abundance)) %>%
  # Calculate relative abuncance per sample
  group_by(collection_date, depth, library_ID) %>% 
  reframe(RA = Abundance/sum(Abundance), Kingdom, Abundance) %>%
  mutate(Marker = "18S")
## `summarise()` has grouped output by 'collection_date', 'Kingdom', 'library_ID'.
## You can override using the `.groups` argument.
# Calculate % metazoans for all samples in COI data
summary_COI <- Good_COI_data %>% 
  mutate(depth = ifelse(Type == "Bulk_Zooplankton", "230-0", depth)) %>%
  # Summarise abundance per kingdom
  group_by(collection_date, depth, Kingdom, library_ID) %>% 
  summarise(Abundance = sum(Abundance)) %>%
  # Calculate relative abuncance per sample
  group_by(collection_date, depth, library_ID) %>% 
  reframe(RA = Abundance/sum(Abundance), Kingdom, Abundance) %>%
  mutate(Marker = "COI")
## `summarise()` has grouped output by 'collection_date', 'depth', 'Kingdom'. You
## can override using the `.groups` argument.
# Merge the two datasets
bind_rows(summary_18S, summary_COI) %>%
  # Show only proportion of animals
  filter(Kingdom == "Animalia") %>%
  # Arrange sampling depths
  mutate(depth = ifelse(depth == "230", "260", depth)) %>% 
  mutate(depth = factor(depth,
                                   levels= c("0", "5", "30",
                                             "100", "260", "230-0"))) %>%
  # Plot boxplot
  ggplot() +
  geom_boxplot(aes(depth, RA, color = Marker)) +
  theme_cowplot(12) +
  theme(aspect.ratio = 1,
        strip.background =element_blank()) +
  scale_color_manual(values = c('#377eb8','#4daf4a'))

ggsave("Figure_output/Sup.Fig.S1A_AnnotationSuccess.pdf", width = 4, height = 4)

rm(summary_18S, summary_COI)

2. Data Transformation

In this section the different data formats of the 2017 QU39 time series will be mutated into similar data shapes and transformed in such a way that the data types can be compared. To keep filtering parameters equal for all data types, a set of functions are first defined, that are then executed for all data sets. Comparisons are done both at species and at genus level.

Key for data transformation is the index_RA function that is defined in ./Code/Functions.R. This function calculates eDNA-index and relative abundance for all taxa and samples. Read more details in the function description.

Transforming Microscopy data:

Microscopy count data from integrated zooplankton net tows:

# Species level
Net_count_2017_index_species <- df_count_2017 %>% 
  mutate(Depth = DEPTH_STRT,
         Taxa = paste(Genus, Tax1, sep = " "),
         Abundance = Biomass,
         Sample = Key) %>%
  index_RA(Sample, Taxa, Abundance, Depth, Date)
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
# Genus level
Net_count_2017_index_genus <- df_count_2017 %>% 
  mutate(Depth = DEPTH_STRT,
         Taxa = Genus,
         Abundance = Biomass,
         Sample = Key) %>%
  index_RA(Sample, Taxa, Abundance, Depth, Date)
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.

Transforming DNA data:

The DNA datasets are transformed identical to the microscopy data sets. Due to their similarity, a generalized function is first created, then executed for the data sets and taxonomic levels individually.

# Define wrapper function
calculate_index <- function(data, taxrank, type) {
  
  data %>% 
    # Filter Sample type
    filter(Type == type,
           # Only animals and annotated taxa are included
           Kingdom == "Animalia",
           str_detect({{ taxrank }}, "@", negate = TRUE)) %>%
    mutate(Taxa = {{ taxrank }},
           Depth = depth,
           Date = collection_date,
           Sample = library_ID) %>% 
    # Execute the transformation
    index_RA(Sample, Taxa, Abundance, Depth, Date)
}

# Execute function for all subsets:
Net_18S_2017_index_species <- calculate_index(Good_18S_data, Species, "Bulk_Zooplankton")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
Net_18S_2017_index_genus <- calculate_index(Good_18S_data, Genus, "Bulk_Zooplankton")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
eDNA_18S_2017_index_species <- calculate_index(Good_18S_data, Species, "Filtered_Water")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
eDNA_18S_2017_index_genus <- calculate_index(Good_18S_data, Genus, "Filtered_Water")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
Net_COI_2017_index_species <- calculate_index(Good_COI_data, Species, "Bulk_Zooplankton")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
Net_COI_2017_index_genus <- calculate_index(Good_COI_data, Genus, "Bulk_Zooplankton")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
eDNA_COI_2017_index_species <- calculate_index(Good_COI_data, Species, "Filtered_Water")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
eDNA_COI_2017_index_genus <- calculate_index(Good_COI_data, Genus, "Filtered_Water")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
EtOH_COI_2017_index_species <- calculate_index(Good_COI_data, Species, "Zooplankton_Preservative")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.
EtOH_COI_2017_index_genus <- calculate_index(Good_COI_data, Genus, "Zooplankton_Preservative")
## `summarise()` has grouped output by 'Sample', 'Taxa', 'Depth'. You can override
## using the `.groups` argument.

Defining a function for filtering singletons.

ListNoSing <- function(data, type = "DNA") {
  
  if (type == "DNA") {data <- filter(data, Abundance > 1)}
  
  data %>%
  group_by(Taxa) %>% 
  summarise(n=n()) %>% 
  filter(n>1) %>% 
  pull(Taxa)
}

3. Taxonomic Comparison

We compared the range of taxa that were captured with the Marine-Invertebrate COI primer vs the universal Eukaryote 18S primer, compared with microscopic zooplankton analysis.

3.1 Euler plots

Comparing zooplankton net samples, 18S vs COI vs microscopy.

# Genus level comp.
GenusListComp_1 <- list(
  `Net Microscopy` = ListNoSing(Net_count_2017_index_genus, "m"),
  `Net 18S` = ListNoSing(Net_18S_2017_index_genus),
  `Net COI` = ListNoSing(Net_COI_2017_index_genus)
)

pdf("Figure_output/Fig.2A_EulerGenusNet.pdf", width = 3.5, height = 3.5)
  set.seed(102)
  GenusListComp_1 %>% 
    fromList() %>% 
    euler(shape = "circle") %>% 
    plot(quantities = TRUE,
         lty = 1:5)
  dev.off()
## quartz_off_screen 
##                 2
# Species level comp.
SpeciesListComp_1 <- list(
  `Net Microscopy` = ListNoSing(Net_count_2017_index_species, "m"),
  `Net 18S` = ListNoSing(Net_18S_2017_index_species),
  `Net COI` = ListNoSing(Net_COI_2017_index_species)
)

pdf("Figure_output/Fig.2B_EulerSpeciesNet.pdf", width = 3.5, height = 3.5)
  SpeciesListComp_1 %>% 
    fromList() %>% 
    euler(shape = "circle") %>% 
    plot(quantities = TRUE,
        lty = 1:5)
  dev.off()
## quartz_off_screen 
##                 2

Comparing zooplankton net samples, with water samples.

# 18S and microscopy
ListComp_18S <- list(
  `Net Microscopy` = ListNoSing(Net_count_2017_index_genus, type = "m"),
  `Net 18S` = ListNoSing(Net_18S_2017_index_genus),
  `Water 18S` = ListNoSing(eDNA_18S_2017_index_genus)
)

pdf("Figure_output/Fig.S2A_EulerCOI.pdf", width = 3.5, height = 3.5)
  set.seed(102)
  ListComp_18S %>% 
    fromList() %>% 
    euler(shape = "circle") %>% 
    plot(quantities = TRUE,
         lty = 1:5)
  dev.off()
## quartz_off_screen 
##                 2
# COI and microscopy
ListComp_COI <- list(
  `Net Microscopy` = ListNoSing(Net_count_2017_index_genus, type = "m"),
  `Net COI` = ListNoSing(Net_COI_2017_index_genus),
  `Water COI` = ListNoSing(eDNA_COI_2017_index_genus)
)


pdf("Figure_output/Fig.S2B_Euler18S.pdf", width = 3.5, height = 3.5)
  ListComp_COI %>% 
    fromList() %>% 
    euler(shape = "circle") %>% 
    plot(quantities = TRUE,
        lty = 1:5)
  dev.off()
## quartz_off_screen 
##                 2
rm(GenusListComp_1, SpeciesListComp_1, ListComp_18S, ListComp_COI)

Conclusions:

  • At species level, COI appears to Identify conciderably more of the counted zooplankton taxa than 18S.

  • At genus level all methods align better. COI preforms better than 18S, and Net samples preform better than eDNA.

3.2 Proportion of detected biomass

From the Euler diagram it appears as only a fraction of counted zooplankton is detected by the DNA methods. However, the diagram values all taxa equally and does not account for the relative importance of the taxa. The following code extracts the biomass proportion of counted taxa that are also detected by COI and 18S respectively.

{ ### Preparing the data at Species Level:
Sum_per_species <- Net_count_2017_index_species %>% 
  filter(Taxa %in% ListNoSing(Net_count_2017_index_species, type = "m")) %>% 
  group_by(Taxa, Date) %>% 
  summarize(Biomass = sum(Abundance))

Sum_per_day <- Sum_per_species %>%
  group_by(Date) %>% 
  summarize(Counted_Biomass = sum(Biomass))

Detected_COI <- Sum_per_species %>% 
  filter(Taxa %in% ListNoSing(Net_COI_2017_index_species)) %>% 
  group_by(Date) %>% 
  summarize(Detected_COI = sum(Biomass)) %>% 
  left_join(Sum_per_day, by = "Date")

Detected_18S <- Sum_per_species %>% 
  filter(Taxa %in% ListNoSing(Net_18S_2017_index_species)) %>% 
  group_by(Date) %>% 
  summarize(Detected_18S = sum(Biomass)) %>% 
  left_join(Detected_COI, by = "Date")

Sum_per_day_all <- Sum_per_species %>% 
  filter(Taxa %in% c(ListNoSing(Net_18S_2017_index_species),
                     ListNoSing(Net_COI_2017_index_species))) %>% 
  group_by(Date) %>%
  summarize(Detected_18S_COI = sum(Biomass)) %>% 
  left_join(Detected_18S, by = "Date")

#Plot yearly average
fraction_species <- 
  Sum_per_day_all %>% 
  mutate(`Bulk COI` = Detected_COI / Counted_Biomass,
         `Bulk 18S` = Detected_18S / Counted_Biomass,
         `Bulk COI & 18S` = Detected_18S_COI /Counted_Biomass) %>% 
  select(Date, `Bulk COI`, `Bulk 18S`, `Bulk COI & 18S`) %>% 
  pivot_longer(2:4, names_to = "Marker", values_to = "Proportion") %>% 
  mutate(Level = "Species level")
}
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.
{ #Preparing the data at Genus Level:
Sum_per_genus <- Net_count_2017_index_genus %>% 
  filter(Taxa %in% ListNoSing(Net_count_2017_index_genus, type = "m")) %>% 
  group_by(Taxa, Date) %>% 
  summarize(Biomass = sum(Abundance))

Sum_per_day <- Sum_per_genus %>%
  group_by(Date) %>% 
  summarize(Counted_Biomass = sum(Biomass))

Detected_COI <- Sum_per_genus %>% 
  filter(Taxa %in% ListNoSing(Net_COI_2017_index_genus)) %>% 
  group_by(Date) %>% 
  summarize(Detected_COI = sum(Biomass)) %>% 
  left_join(Sum_per_day, by = "Date")
  
Detected_18S <- Sum_per_genus %>% 
  filter(Taxa %in% ListNoSing(Net_18S_2017_index_genus)) %>% 
  group_by(Date) %>% 
  summarize(Detected_18S = sum(Biomass)) %>% 
  left_join(Detected_COI, by = "Date")

Sum_per_day_all <- Sum_per_genus %>% 
  filter(Taxa %in% c(ListNoSing(Net_18S_2017_index_genus),
                     ListNoSing(Net_COI_2017_index_genus))) %>% 
  group_by(Date) %>%
  summarize(Detected_18S_COI = sum(Biomass)) %>% 
  left_join(Detected_18S, by = "Date")

fraction_genus <-   
  Sum_per_day_all %>% 
  mutate(`Bulk COI` = Detected_COI / Counted_Biomass,
         `Bulk 18S` = Detected_18S / Counted_Biomass,
         `Bulk COI & 18S` = Detected_18S_COI /Counted_Biomass) %>% 
  select(Date, `Bulk COI`, `Bulk 18S`, `Bulk COI & 18S`) %>% 
  pivot_longer(2:4, names_to = "Marker", values_to = "Proportion") %>% 
  mutate(Level = "Genus level")
}
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.
# Combine and plot
Biomass_fraction_plot <- bind_rows(fraction_genus, fraction_species) %>%
  #mutate(Level = factor(Level, levels = c("Species level", "Genus level"))) %>% 
  ggplot(aes(Marker, Proportion)) +
  geom_violin(aes(color = Marker)) +
  cowplot::theme_minimal_hgrid(12) +
  facet_wrap("Level") +
  theme(aspect.ratio = 1,
        axis.title.x=element_blank())
Biomass_fraction_plot

ggsave("Figure_output/Fig.2CD_BiomassFraction.pdf", width = 7, height = 4.5)


rm(Biomass_fraction_plot, Detected_18S, Detected_COI, fraction_genus,
   fraction_species, Sum_per_day, Sum_per_day_all,
   Sum_per_species, Sum_per_genus)

Conclusions:

  1. 18S (Balzano) Ccannot be used to identify taxa at species level.

  2. COI Covers around 38-75% of the ZP-net biomass at the specie level.

  3. All markers have higher agreement on the genus level, but COI still outperforms 18S also at the genus level.

  4. Combining 18S and COI is slightly, but only marginally better than using COI alone.

3.3 Exclusive taxa

So the DNA methods combined targets 50-70% of zooplankton my biomass. Here we investigate the remaining genera, that remains un-targeted by the DNA methods. The Genera are ordered by relative biomass contribution.

# Detected by BOTH microscopy and ny of the other methods
microscopy_and_DNA <- Net_count_2017_index_genus %>% 
  filter(Taxa %in% ListNoSing(., "m"),
         Taxa %in% c(ListNoSing(Net_18S_2017_index_genus),
                     ListNoSing(Net_COI_2017_index_genus))) %>% 
  
  group_by(Taxa) %>% 
  summarise(Abundance = sum(Abundance)) %>% 
  mutate(RA = Abundance/sum(Abundance)) %>% 
  filter(RA > 0.000) %>% 
  mutate(Type = "Microscopy and DNA")


# Only detected by microscopy
only_microscopy <- Net_count_2017_index_genus %>% 
  filter(Taxa %in% ListNoSing(Net_count_2017_index_genus, type = "m"),
         Taxa %!in% ListNoSing(Net_18S_2017_index_genus),
         Taxa %!in% ListNoSing(Net_COI_2017_index_genus)) %>% 
  
  group_by(Taxa, Date) %>% 
  summarize(RA = mean(RA)) %>% 
  group_by(Taxa) %>% 
  summarize(RA = mean(RA)) %>%
  filter(RA > 0.000) %>% 
  mutate(Type = "Only by Microscopy")
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.
# Only detected by COI
only_COI <- Net_COI_2017_index_genus %>%
  filter(Taxa %in% ListNoSing(Net_COI_2017_index_genus)) %>% 
  filter(Taxa %!in% ListNoSing(Net_count_2017_index_genus, type = "m")) %>% 
  group_by(Taxa) %>% 
  summarize(RA = mean(RA)) %>%
  mutate(Type = "Only by COI")
only_COI
## # A tibble: 19 × 3
##    Taxa                       RA Type       
##    <chr>                   <dbl> <chr>      
##  1 Aurelia            0.00000891 Only by COI
##  2 Balanus            0.00184    Only by COI
##  3 Cancer             0.00131    Only by COI
##  4 Crepipatella       0.00000105 Only by COI
##  5 Eualus             0.000217   Only by COI
##  6 Euphysa            0.00000948 Only by COI
##  7 Eusergestes        0.000133   Only by COI
##  8 Harmothoe          0.000176   Only by COI
##  9 Hermissenda        0.0000182  Only by COI
## 10 Membranipora       0.000101   Only by COI
## 11 Mesochaetopterus   0.000335   Only by COI
## 12 Metacarcinus       0.000104   Only by COI
## 13 Nematoscelis       0.00000501 Only by COI
## 14 Ophiopholis        0.0000405  Only by COI
## 15 Petrasma           0.00000967 Only by COI
## 16 Placida            0.00170    Only by COI
## 17 Plotocnide         0.00000217 Only by COI
## 18 Spirontocaris      0.000448   Only by COI
## 19 Strongylocentrotus 0.0000415  Only by COI
# Only detected by 18S
only_18S <- Net_18S_2017_index_genus %>% 
  filter(Taxa %in% ListNoSing(Net_18S_2017_index_genus),
         Taxa %!in% ListNoSing(Net_count_2017_index_genus, type = "m")) %>% 
  group_by(Taxa) %>% 
  summarize(RA = mean(RA)) %>%
  mutate(Type = "Only by 18S")

#Combine and plot
bind_rows(microscopy_and_DNA, only_microscopy, only_COI, only_18S) %>% 
  mutate(Type = factor(Type, levels = c(
    "Microscopy and DNA", "Only by Microscopy", "Only by COI", "Only by 18S"
  ))) %>%
  filter(RA > 0.001) %>%  ##### NOTE: REMOVED LOW COUNTS! #######
  ggplot() +
  geom_bar(aes(reorder(Taxa, -RA), RA), stat = "identity") +
  theme_minimal_hgrid(9) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,
                                   face = "italic")) +
  facet_wrap("Type", scales = "free_x") +
  ylab("Relative abundance") +
  xlab("Genus")

ggsave("Figure_output/Fig.3_RemainingTaxa.pdf")
## Saving 7 x 5 in image
rm(only_18S, only_COI, only_microscopy, microscopy_and_DNA)

Conclusions

  1. 18S Cannot be used to identify taxa at species level, the gene is to conserved.

  2. COI Covers around 38-75% of the ZP-net biomass at the specie level.

  3. All markers have higher agreement on the genus level, but COI still outperforms 18S also at the genus level.

  4. Combining 18S and COI is slightly, but only marginally better than using COI alone.

  5. Out of the biomass not detected by the molecular markers, larger taxa, such as amphipods, has the highest biomass.

  6. eDNA of COI and 18S catches many different metazooan taxa not targeted by microscopy. Most of these taxa are benthic mollusks, worms, echinoderms or cnidarians.

Selection of taxa subsets.

To allow for plotting a subset of taxa, we define

# Vector for all genera
GenusList <-
  Net_count_2017_index_genus %>%
  
  # Select taxa present in both microscopy and DNA NET samples
  filter(Taxa %in% ListNoSing(Net_count_2017_index_genus, "m"),
         Taxa %in% c(ListNoSing(Net_18S_2017_index_genus),
                     ListNoSing(Net_COI_2017_index_genus))) %>%
  # Pull list
  pull(Taxa) %>% unique()


# Vector for the top 27 abundant genera
GenusList27 <-
  Net_count_2017_index_genus %>%
  
  # Select taxa present in both microscopy and DNA NET samples
  filter(Taxa %in% ListNoSing(Net_count_2017_index_genus, "m"), 
         Taxa %in% c(ListNoSing(Net_18S_2017_index_genus),
                     ListNoSing(Net_COI_2017_index_genus))) %>%
  
  # Select the 27 most abundant taxa
  group_by(Taxa) %>% 
  summarize(RA = mean(RA)) %>% 
  arrange(-RA) %>% 
  mutate(count = 1:35) %>% 
  filter(count < 28) %>% 
  
  # Pull list
  pull(Taxa) %>% unique()

# Vector for the top 15 abundant genera
GenusList15 <-
  Net_count_2017_index_genus %>%
  
  # Select taxa present in both microscopy and DNA NET samples
  filter(Taxa %in% ListNoSing(Net_count_2017_index_genus, "m"),
         Taxa %in% c(ListNoSing(Net_18S_2017_index_genus),
                     ListNoSing(Net_COI_2017_index_genus))) %>%
  
  # Select the 27 most abundant taxa
  group_by(Taxa) %>% 
  summarize(RA = mean(RA)) %>% 
  arrange(-RA) %>% 
  mutate(count = 1:35) %>% 
  filter(count < 16) %>%
  
  # Pull list
  pull(Taxa) %>% unique()


# At Species Level
SpeciesList <-
  Net_count_2017_index_species %>% 
  filter(Taxa %in% ListNoSing(., "m"),
         Taxa %in% c(ListNoSing(Net_COI_2017_index_species),
                     ListNoSing(eDNA_COI_2017_index_species))) %>% 
  pull(Taxa) %>% unique()

4. Performance of normalization methods

We compare the preformace of normalization methods (Relative abundance RA, Pressence/Absence PA, and eDNA-index RA-index) for the Zooplankton net samples, to compare microscopy, 18S and COI analysis methods. To do this, we first construct a combined dataset:

Net_merged_data <-
  # Combine data sets
  bind_rows(
    mutate(Net_count_2017_index_genus, marker = "Microscopy"),
    mutate(Net_COI_2017_index_genus, marker = "COI"),
    mutate(Net_18S_2017_index_genus, marker = "18S")) %>% 
  
  # Calculate pressence / Absence from the relative abundance
  mutate(PA = ifelse(RA>0.001, 1, 0)) %>% 
  
  # Remove unwanted taxa and columns
  filter(Taxa %in% GenusList) %>%
  select(Taxa, Date, Abundance, RA, RA_Index, PA, marker) %>% 
  
  # Wrangle the data set to make tidy for comparisons between Molecular/Microscopy
  # Pivot transformation method long format. 
  pivot_longer(3:6, names_to = "Index_Type", values_to = "value") %>% 
  
  # Pivot marker (18S, COI, microscopy) wide.
  pivot_wider(names_from = marker, values_from = value) %>%
  
  # Make pairwise comparison: 18S-microscopy vs COI-microscopy
  pivot_longer(5:6, names_to = "Gene_Marker", values_to = "Molecular") %>% 
  select(Taxa, Date, Gene_Marker, Index_Type, Microscopy, Molecular) %>%
  
  # Remove samples with no matches Molecular/Mic
  filter(is.na(Molecular) == FALSE,
         is.na(Microscopy) == FALSE)

4.1 Multivariate comparison

To evaluate the effect of analysis and normalization method on the merged datasets, a permutational multivariate analysis of variance (permanova) is conducted with 999 permutations based on the Bray-Curtis distances between samples in the community matrix using adonis from the vegan R package v. 2.6-4 (Oksanen et al., 2007). The results are visualized with 2-dimensional non-metric multidimensional scaling (NMDS) using the metaMDS function from the vegan package (allowing for 100 iterations). Mantel test was used to test for covariance between sample distance matrixes produced with microscopy vs molecular data using the mantel function in vegan with 9999 permutations.

Multivariate Function

To be able to repeat multivariate statistics and plots, multivariate statistics are defined as a function:

NMDS <- function(data = Net_merged_data, Marker = "COI", Type = "RA") {
  
  # Construct community matrix dataframe 
  wide <- filter(data, Gene_Marker == Marker, Index_Type == Type) %>%
    select(Taxa, Date, Microscopy, Molecular) %>%
    pivot_longer(3:4, names_to = "Method", values_to = "Value") %>% 
    pivot_wider(names_from = Taxa, values_from = Value)
  
  ## MANTEL TEST
  # Matrix for molecular
  molmat <- wide %>% 
    filter(Method == "Molecular") %>% 
    mutate(Sample = Date) %>%
    select(!1:2) %>% 
    column_to_rownames("Sample") %>% 
    as.matrix()
  
  # Matrix for microscopy 
  micmat <- wide %>% 
    filter(Method == "Microscopy") %>% 
    mutate(Sample = Date) %>%
    select(!1:2) %>% 
    column_to_rownames("Sample") %>% 
    as.matrix()
  
  # Control that sample names are the same
  control <- ifelse(row.names(molmat) != row.names(micmat),"WARNING", "OK")
  ifelse("WARNING" %in% control, print("WARNING"), print("OK"))
  
  # Execute the mantel test
  m.mod <- mantel(vegdist(molmat), vegdist(micmat))
  
  ## PERMANOVA and NMDS
  # Combined community matrix
  mat <- wide %>% 
    mutate(Sample = paste(Date, Method, sep = "_")) %>%
    select(!1:2) %>% 
    column_to_rownames("Sample") %>% 
    as.matrix()
  
  # Execute permanova
  permanova <- adonis2(mat ~ Method+Date,
                       data = mutate(wide, Date = format(as.Date(Date),
                                                         format="%m")),
                       permutations = 999, method="bray")
  
  # Execute NMDS
  mod <- metaMDS(mat, trymax = 100)
  
  
  # DATA EXTRACTION
  
  # Extract score coordinates from NMDS
  NMDS_scores <- function(mod) {
    Samples <- 
    scores(mod)$sites %>% 
    as.data.frame() %>%
    rownames_to_column("Name") %>% 
    separate(Name, into = c("Date", "Method"), sep = "_") %>% 
    mutate(Layer = "Samples")

  Taxa <-
    scores(mod)$species %>% 
    as.data.frame() %>%
    rownames_to_column("Taxa") %>% 
    mutate(Layer = "Taxa")

  bind_rows(Samples, Taxa)
  } # End NMDS Scores
  
  out <- NMDS_scores(mod) %>% 
    mutate(Gene_Marker = Marker, Index_Type = Type)
  
  
  # Combine data set of statistical output
  stats <- tibble(Marker = Marker,
       Type = Type,
       NMDS.stress = mod$stress,
       Method.R2 = permanova$R2[1],
       Method.F = permanova$F[1],
       Method.P = permanova$`Pr(>F)`[1],
       Date.R2 = permanova$R2[2],
       Date.F = permanova$F[2],
       Date.P = permanova$`Pr(>F)`[2],
       Mantel.R = m.mod$statistic,
       Mantel.P = m.mod$signif)
  
  output <- list(NMDS.scores = out,
                 NMDS.stressplot = stressplot(mod),
                 Stats = stats)
  
  
  return(output)
}

Execute for all datasets

mod1 <- NMDS(Marker = "COI", Type = "RA")
## [1] "OK"
## Run 0 stress 0.1729314 
## Run 1 stress 0.1729314 
## ... New best solution
## ... Procrustes: rmse 3.294049e-05  max resid 0.0001528643 
## ... Similar to previous best
## Run 2 stress 0.1797813 
## Run 3 stress 0.1830087 
## Run 4 stress 0.1728551 
## ... New best solution
## ... Procrustes: rmse 0.009756111  max resid 0.04485131 
## Run 5 stress 0.1729314 
## ... Procrustes: rmse 0.009757584  max resid 0.04486873 
## Run 6 stress 0.1830087 
## Run 7 stress 0.2133701 
## Run 8 stress 0.1797813 
## Run 9 stress 0.1798019 
## Run 10 stress 0.21773 
## Run 11 stress 0.1903895 
## Run 12 stress 0.1921537 
## Run 13 stress 0.1904928 
## Run 14 stress 0.182842 
## Run 15 stress 0.1729314 
## ... Procrustes: rmse 0.00975637  max resid 0.04486273 
## Run 16 stress 0.1830087 
## Run 17 stress 0.1728551 
## ... New best solution
## ... Procrustes: rmse 6.716636e-05  max resid 0.0003015934 
## ... Similar to previous best
## Run 18 stress 0.1728551 
## ... New best solution
## ... Procrustes: rmse 2.957596e-06  max resid 1.278831e-05 
## ... Similar to previous best
## Run 19 stress 0.182842 
## Run 20 stress 0.1729314 
## ... Procrustes: rmse 0.009753565  max resid 0.04482468 
## *** Best solution repeated 1 times

mod2 <- NMDS(Marker = "COI", Type = "PA")
## [1] "OK"
## Run 0 stress 0.1782295 
## Run 1 stress 0.1799811 
## Run 2 stress 0.2109472 
## Run 3 stress 0.1879819 
## Run 4 stress 0.1799811 
## Run 5 stress 0.1799811 
## Run 6 stress 0.1782877 
## ... Procrustes: rmse 0.005405277  max resid 0.02675052 
## Run 7 stress 0.1782876 
## ... Procrustes: rmse 0.00539689  max resid 0.02673496 
## Run 8 stress 0.1869326 
## Run 9 stress 0.1799717 
## Run 10 stress 0.1799811 
## Run 11 stress 0.1799811 
## Run 12 stress 0.1876046 
## Run 13 stress 0.1942479 
## Run 14 stress 0.1782876 
## ... Procrustes: rmse 0.005401527  max resid 0.02673718 
## Run 15 stress 0.1782296 
## ... Procrustes: rmse 2.754245e-05  max resid 8.334306e-05 
## ... Similar to previous best
## Run 16 stress 0.2123819 
## Run 17 stress 0.1799716 
## Run 18 stress 0.1879819 
## Run 19 stress 0.1799811 
## Run 20 stress 0.1799811 
## *** Best solution repeated 1 times

mod3 <- NMDS(Marker = "COI", Type = "RA_Index")
## [1] "OK"
## Run 0 stress 0.229946 
## Run 1 stress 0.2589802 
## Run 2 stress 0.2299827 
## ... Procrustes: rmse 0.003568419  max resid 0.01674597 
## Run 3 stress 0.2332293 
## Run 4 stress 0.2330485 
## Run 5 stress 0.2363588 
## Run 6 stress 0.2299247 
## ... New best solution
## ... Procrustes: rmse 0.01241305  max resid 0.06345527 
## Run 7 stress 0.2794731 
## Run 8 stress 0.2337812 
## Run 9 stress 0.2327175 
## Run 10 stress 0.2752637 
## Run 11 stress 0.229851 
## ... New best solution
## ... Procrustes: rmse 0.004232046  max resid 0.02041925 
## Run 12 stress 0.2548352 
## Run 13 stress 0.2363588 
## Run 14 stress 0.2299245 
## ... Procrustes: rmse 0.004024504  max resid 0.01965912 
## Run 15 stress 0.2364958 
## Run 16 stress 0.2298242 
## ... New best solution
## ... Procrustes: rmse 0.003470575  max resid 0.01634122 
## Run 17 stress 0.2299827 
## ... Procrustes: rmse 0.01167224  max resid 0.06471058 
## Run 18 stress 0.2363643 
## Run 19 stress 0.2363643 
## Run 20 stress 0.2299247 
## ... Procrustes: rmse 0.005560408  max resid 0.02046456 
## Run 21 stress 0.2546502 
## Run 22 stress 0.2364836 
## Run 23 stress 0.2298242 
## ... Procrustes: rmse 4.711618e-06  max resid 1.24609e-05 
## ... Similar to previous best
## *** Best solution repeated 1 times

mod4 <- NMDS(Marker = "18S", Type = "RA")
## [1] "OK"
## Run 0 stress 0.1515485 
## Run 1 stress 0.2039911 
## Run 2 stress 0.1515485 
## ... Procrustes: rmse 6.076688e-06  max resid 3.176846e-05 
## ... Similar to previous best
## Run 3 stress 0.1515485 
## ... New best solution
## ... Procrustes: rmse 5.453497e-06  max resid 2.94784e-05 
## ... Similar to previous best
## Run 4 stress 0.1898251 
## Run 5 stress 0.1515485 
## ... Procrustes: rmse 3.977544e-06  max resid 2.128542e-05 
## ... Similar to previous best
## Run 6 stress 0.1515485 
## ... Procrustes: rmse 5.772845e-06  max resid 3.16019e-05 
## ... Similar to previous best
## Run 7 stress 0.1515485 
## ... Procrustes: rmse 1.623876e-05  max resid 9.102008e-05 
## ... Similar to previous best
## Run 8 stress 0.1515485 
## ... Procrustes: rmse 4.169585e-06  max resid 2.043555e-05 
## ... Similar to previous best
## Run 9 stress 0.1515485 
## ... Procrustes: rmse 7.08425e-06  max resid 3.92784e-05 
## ... Similar to previous best
## Run 10 stress 0.1897312 
## Run 11 stress 0.1515485 
## ... Procrustes: rmse 5.289805e-06  max resid 2.350063e-05 
## ... Similar to previous best
## Run 12 stress 0.1515485 
## ... Procrustes: rmse 4.000149e-06  max resid 1.34316e-05 
## ... Similar to previous best
## Run 13 stress 0.1515485 
## ... Procrustes: rmse 7.698472e-06  max resid 4.324141e-05 
## ... Similar to previous best
## Run 14 stress 0.1515485 
## ... New best solution
## ... Procrustes: rmse 1.455532e-06  max resid 3.81463e-06 
## ... Similar to previous best
## Run 15 stress 0.1515485 
## ... Procrustes: rmse 2.587979e-06  max resid 1.181256e-05 
## ... Similar to previous best
## Run 16 stress 0.1515485 
## ... Procrustes: rmse 4.030675e-06  max resid 2.244695e-05 
## ... Similar to previous best
## Run 17 stress 0.1515485 
## ... Procrustes: rmse 2.780907e-06  max resid 1.433897e-05 
## ... Similar to previous best
## Run 18 stress 0.1515485 
## ... Procrustes: rmse 1.112566e-05  max resid 6.340486e-05 
## ... Similar to previous best
## Run 19 stress 0.1515485 
## ... Procrustes: rmse 4.903873e-06  max resid 2.484589e-05 
## ... Similar to previous best
## Run 20 stress 0.1515485 
## ... Procrustes: rmse 4.636724e-06  max resid 2.589168e-05 
## ... Similar to previous best
## *** Best solution repeated 7 times

mod5 <- NMDS(Marker = "18S", Type = "PA")
## [1] "OK"
## Run 0 stress 0.1648822 
## Run 1 stress 0.202059 
## Run 2 stress 0.1768417 
## Run 3 stress 0.237987 
## Run 4 stress 0.180124 
## Run 5 stress 0.2079033 
## Run 6 stress 0.180124 
## Run 7 stress 0.1775264 
## Run 8 stress 0.1648822 
## ... New best solution
## ... Procrustes: rmse 2.765246e-06  max resid 9.167204e-06 
## ... Similar to previous best
## Run 9 stress 0.2194206 
## Run 10 stress 0.1648822 
## ... New best solution
## ... Procrustes: rmse 5.070433e-06  max resid 2.010249e-05 
## ... Similar to previous best
## Run 11 stress 0.1768417 
## Run 12 stress 0.2093228 
## Run 13 stress 0.1775264 
## Run 14 stress 0.2065656 
## Run 15 stress 0.1648822 
## ... Procrustes: rmse 6.572714e-06  max resid 2.828898e-05 
## ... Similar to previous best
## Run 16 stress 0.2025463 
## Run 17 stress 0.199336 
## Run 18 stress 0.2062758 
## Run 19 stress 0.180124 
## Run 20 stress 0.1775264 
## *** Best solution repeated 2 times

mod6 <- NMDS(Marker = "18S", Type = "RA_Index")
## [1] "OK"
## Run 0 stress 0.22905 
## Run 1 stress 0.2291326 
## ... Procrustes: rmse 0.004810171  max resid 0.02195877 
## Run 2 stress 0.22905 
## ... New best solution
## ... Procrustes: rmse 4.407711e-05  max resid 0.0001634632 
## ... Similar to previous best
## Run 3 stress 0.22905 
## ... Procrustes: rmse 2.091524e-05  max resid 9.052574e-05 
## ... Similar to previous best
## Run 4 stress 0.22905 
## ... Procrustes: rmse 1.428698e-05  max resid 4.726789e-05 
## ... Similar to previous best
## Run 5 stress 0.22905 
## ... Procrustes: rmse 1.164494e-05  max resid 4.209806e-05 
## ... Similar to previous best
## Run 6 stress 0.22905 
## ... Procrustes: rmse 1.654146e-05  max resid 6.480173e-05 
## ... Similar to previous best
## Run 7 stress 0.2577171 
## Run 8 stress 0.22905 
## ... Procrustes: rmse 8.385198e-06  max resid 3.63588e-05 
## ... Similar to previous best
## Run 9 stress 0.22905 
## ... Procrustes: rmse 2.442364e-05  max resid 8.089969e-05 
## ... Similar to previous best
## Run 10 stress 0.2451632 
## Run 11 stress 0.2386129 
## Run 12 stress 0.256818 
## Run 13 stress 0.238613 
## Run 14 stress 0.2291325 
## ... Procrustes: rmse 0.004812485  max resid 0.02194858 
## Run 15 stress 0.22905 
## ... Procrustes: rmse 2.324002e-05  max resid 8.621688e-05 
## ... Similar to previous best
## Run 16 stress 0.2458877 
## Run 17 stress 0.2419902 
## Run 18 stress 0.22905 
## ... Procrustes: rmse 2.967634e-05  max resid 0.0001038422 
## ... Similar to previous best
## Run 19 stress 0.22905 
## ... Procrustes: rmse 2.530951e-05  max resid 0.0001150715 
## ... Similar to previous best
## Run 20 stress 0.2378508 
## *** Best solution repeated 10 times

bind_rows(
  mod1$Stats, mod2$Stats, mod3$Stats, mod4$Stats, mod5$Stats, mod6$Stats) %>% 
  write_csv(file = "./Figure_output/Tab.1_MultivariateStats.csv")

Plot NMDS

We use the combined output from abuve to plot NMDS:

# Define a seasonal color scheme:
seasonal_colors = c("#053061", "#3288BD", "#66C2A5", "#7FBC41",
                    "#A6D96A", "#FEE08B", "#FDAE61", "#F46D43",
                    "#D53E4F", "#9E0142", "#67001F", "#40004B")


# Combine multivariate outputs:
bind_rows(mod1$NMDS.scores, mod2$NMDS.scores, mod3$NMDS.scores,
          mod4$NMDS.scores, mod5$NMDS.scores, mod6$NMDS.scores) %>% 
  pivot_wider(names_from = Layer,
              values_from = c(NMDS1, NMDS2)) %>%
  # Modify scales and factors
  mutate(Month = format(as.Date(Date), format="%m")) %>%
  mutate(Index_Type = ifelse(Index_Type == "RA_Index", "eDNA Index", Index_Type)) %>% 
  mutate(Index_Type = factor(Index_Type, levels = c("RA", "PA", "eDNA Index"))) %>% 
  mutate(Taxa = ifelse(Taxa %in% GenusList15, Taxa, "")) %>% 
  
  # Plot
  ggplot(aes(NMDS1_Samples, NMDS2_Samples)) +
  
  # Taxa are represented as text
  geom_text(aes(NMDS1_Taxa, NMDS2_Taxa, label = Taxa),
            size = 3, colour = "grey", fontface = "italic") +
  
  # Dotted lines combine microscopy and DNA samples from the same date
  geom_line(aes(color = Month, group = Date), linetype = "dotted") +
  
  # Points represent each net.
  geom_point(aes(shape = Method, color = Month), size = 2) +
  
  # Eclipses for CI of the method (DNA vs Microscopy)
  stat_ellipse(aes(linetype = Method), size = 0.2) +
  
  # Facet Rows and Columns
  facet_grid(Index_Type~Gene_Marker, scales = "free") +
  
  # Esthetics
  theme_minimal_grid(12) +
  theme(aspect.ratio = 1,
        panel.border = element_rect(colour = "black", fill=NA),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  scale_color_manual(values = seasonal_colors) +
  scale_shape_manual(values=c(1, 16))
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 138 rows containing non-finite outside the scale range
## (`stat_ellipse()`).
## Warning: Removed 248 rows containing missing values or values outside the scale range
## (`geom_text()`).
## Warning: Removed 138 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 138 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("Figure_output/Fig.3_NMDS.pdf", width = 8, height = 10)
## Warning: Removed 138 rows containing non-finite outside the scale range
## (`stat_ellipse()`).
## Warning: Removed 248 rows containing missing values or values outside the scale range
## (`geom_text()`).
## Warning: Removed 138 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 138 rows containing missing values or values outside the scale range
## (`geom_point()`).

4.2 Regression analyses

linear regressions between microscopy and molecular data were modelled, using the lm function.

Plot correlations

Both 18S and COI combined, Relative abundance vs eDNA-index

Net_merged_data %>% 
  filter(Index_Type %in% c("RA_Index", "RA")) %>%
  #filter(Molecular > 0, Microscopy > 0) %>% 
  ggplot(aes(Microscopy, Molecular)) +
  geom_point() +
  geom_smooth(aes(group = Taxa), color = "grey", method = lm, se=F) +
  geom_smooth(color = "#984ea3", method = lm, se=F) +
  theme_minimal_grid() +
  facet_wrap("Index_Type", scales = "free") +
  theme(aspect.ratio = 1,
        panel.background = element_rect(colour = 'darkgrey')) +
  ylim(0,1) +
  xlim(0,1)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 103 rows containing missing values or values outside the scale range
## (`geom_smooth()`).

ggsave("Figure_output/Fig.S3AB_Regression.pdf", width = 7, height = 4)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 103 rows containing missing values or values outside the scale range
## (`geom_smooth()`).

Calculate Rsq, P and Slope for all regressions:

#Create new tables for regression summary and calculate coefficients 
# Empty table
species_list <- data.frame(Taxa = unique(Net_merged_data$Taxa),
                           RA_Rsq = 0,
                           #I_Rsq = 0,
                           
                           RA_p.val = 0,
                           #I_p.val = 0,
                           
                           RA_B0 = 0,
                           I_B0 = 0,
                           
                           RA_slope = 0,
                           I_slope = 0)

# Fill the table for each pairwise comparison:
for(i in 1:length(species_list$Taxa)){ #Start of loop
  RA <- Net_merged_data %>%
    filter(Taxa == species_list$Taxa[i],
           Index_Type == "RA")
  RA_I <- Net_merged_data %>% 
    filter(Molecular > 0, Microscopy > 0) %>%
    filter(Taxa == species_list$Taxa[i],
           Index_Type == "RA_Index")
  
  # Calculate the metrics:
  lm1 <- lm(sqrt(Microscopy) ~ sqrt(Molecular), data = RA)
  lm2 <- lm(sqrt(Microscopy) ~ sqrt(Molecular), data = RA_I)
  species_list$RA_Rsq[i] <- summary(lm1)$adj.r.squared
  species_list$RA_p.val[i] <- summary(lm1)$coefficients[2,4]
  species_list$RA_B0[i] <- lm1$coefficients[1]
  species_list$RA_slope[i] <- lm1$coefficients[2]
  #species_list$I_Rsq[i] <- summary(lm2)$adj.r.squared
  #species_list$I_p.val[i] <- summary(lm2)$coefficients[2,4]
  species_list$I_B0[i] <- lm2$coefficients[1]
  species_list$I_slope[i] <- lm2$coefficients[2]

}

species_list %>% 
  write_csv("./Figure_output/Tab.2_Regressions.csv")

These taxa have significant correlations:

species_list %>%
  filter(RA_p.val > 0.05) %>% 
  pull(Taxa)
##  [1] "Aetideus"        "Beroe"           "Calliopius"      "Candacia"       
##  [5] "Clione"          "Cyphocaris"      "Discoconchoecia" "Euphausia"      
##  [9] "Merluccius"      "Microcalanus"    "Munida"          "Nanomia"        
## [13] "Oikopleura"      "Pandalus"        "Paraeuchaeta"    "Pasiphaea"      
## [17] "Scina"           "Thysanoessa"     "Triconia"

Whuile correlation coefficients are the same using RA and RA-index, the slope will differ. Here we plot the difference in slope between the two standardisation methods.

species_list %>%
  filter(RA_p.val<0.05) %>% 
  select(RA_slope, I_slope) %>%
  pivot_longer(1:2) %>%
  mutate(Method = factor(name, levels = c("RA_slope", "I_slope")),
         `Slope (Molecular ~ Microscopy)` = value) %>%
  ggplot(aes(Method, `Slope (Molecular ~ Microscopy)`)) +
  geom_violin() +
  geom_point() +
  theme_minimal_hgrid() +
  scale_y_log10() +
  theme(aspect.ratio = 1,
        panel.background = element_rect(colour = 'darkgrey'))
## Warning in transformation$transform(x): NaNs produced
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## Warning in transformation$transform(x): NaNs produced
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## Warning: Removed 4 rows containing non-finite outside the scale range
## (`stat_ydensity()`).
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("./Figure_output/Fig.S3C_SlopeDifferences.pdf", width = 4, height = 5)
## Warning in transformation$transform(x): NaNs produced
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## Warning in transformation$transform(x): NaNs produced
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## Warning: Removed 4 rows containing non-finite outside the scale range
## (`stat_ydensity()`).
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_point()`).

5. Temporal comparison of taxonomic markers

The aim of this section is to compare if seasonal patterns of RA_index overlap between the different markers and methods. Zooplankton net tows (Count, COI and 18S) that are depth integrated (only one sample per sampling date) are plotted differently from the eDNA samples (that has a depth resolution with several samples per sampling date).
Genus level comparison:

Zooplankton Seasonality

Plot seasonal trend of bulk zooplankton samples

# Combine the zooplankton datasets at genus level
tmp <- bind_rows(mutate(Net_count_2017_index_genus, marker = "Count"),
          mutate(Net_COI_2017_index_genus, marker = "COI"),
          mutate(Net_18S_2017_index_genus, marker = "18S")) 

# Arrange the taxa according to there total annual biomass
order <- tmp %>% 
  group_by(Taxa, marker) %>% 
  summarize(Abundance = sum(Abundance)) %>% 
  pivot_wider(names_from = "marker", values_from = "Abundance") %>% 
  arrange(-Count) %>% 
  pull(Taxa)
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.
tmp %>% 
  # Filter taxa (all) and order factors
  filter(Taxa %in% GenusList) %>%
  mutate(marker = factor(marker, levels = c("Count", "COI", "18S"))) %>%
  mutate(Taxa = factor(Taxa, levels = c(order))) %>% 
  
  # Plot
  ggplot() +
  geom_line(aes(Date, RA_Index, color = marker, linetype = marker)) +
  facet_wrap(facets = "Taxa", ncol = 4) +
  
  # Esthetics
  theme_minimal_hgrid() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
        strip.text = element_text(face = "italic"),
        legend.position="none",
        panel.background = element_rect(fill = '#F5F5F5', colour = 'darkgrey'),
        aspect.ratio = 1/2.5) +
  scale_color_manual(values = c('#fec44f','#4daf4a','#377eb8'))

ggsave("Figure_output/Fig.5_ZooplanktonSeasonality.pdf", width = 8, height = 10)

RA_index of eDNA samples, the average of 18S and COI.

Zooplankton Vertical Distribution

Plot verical distribution of zooplankton over the season.

# Combine and merge COI and 18S eDNA components. Calculate mean of COI and 18S
eDNA <- bind_rows(mutate(eDNA_COI_2017_index_genus, marker = "COI"),
                  mutate(eDNA_18S_2017_index_genus, marker = "18S")) %>% 
  group_by(Taxa, Date, Depth) %>% 
  summarize(RA_Index = mean(RA_Index),
            Abundance = mean(Abundance),
            marker = "eDNA")
## `summarise()` has grouped output by 'Taxa', 'Date'. You can override using the
## `.groups` argument.
# Combine eDNA data with Net data components.
data <-  eDNA %>%
  bind_rows(mutate(Net_count_2017_index_genus, marker = "Count"),
             mutate(Net_COI_2017_index_genus, marker = "COI"),
             mutate(Net_18S_2017_index_genus, marker = "18S")) %>% 
  filter(Taxa %in% GenusList,
         Taxa %in% pull(eDNA, Taxa)) %>% 
  mutate(Depth = as.character(Depth)) %>% 
  
  # Modify the parameter shown on the categorical Y axis. That is depth for eDNA samples and Marker for Net samples". 
  mutate(y = ifelse(marker == "eDNA", Depth, marker)) %>% 
  mutate(y = factor(y, levels = rev(c("Count", "COI", "18S", "0", "5", "30", "100", "260")))) %>% 
  mutate(Taxa = factor(Taxa, levels = order))

data %>%  ggplot() +
  
  # Add filter water data:
  geom_point(
    mapping = aes(Date, y, color = RA_Index),
    #data = filter(data, y %!in% c("Count", "COI", "18S")),
    shape = 15) +
  scale_color_gradient2(
    high = '#984ea3', low = "white") +

  # Add Microscopy data
  new_scale_color() +
  geom_point(
    mapping = aes(Date, y, color = RA_Index),
    data = filter(data, y %in% c("Count")),
    shape = 15) +
  scale_color_gradient2(high = '#fec44f',
                       low = "white") +

  # Add Net COI data:
  new_scale_color() +
  geom_point(
    aes(Date, y, color = RA_Index),
    filter(data, y %in% c("COI")),
    shape = 15) +
  scale_color_gradient2(high = '#4daf4a',
                      low = "white") +
  
  # Add Net 18S data:
  new_scale_color() +
  geom_point(
    aes(Date, y, color = RA_Index),
    filter(data, y %in% c("18S")),
    shape = 15) +
  scale_color_gradient2(high = '#377eb8',
                       low = "white") +
  
  
  # Set plot themes
  #theme_map() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
        strip.text = element_text(face = "italic"),
  #      legend.position="none",
  #      panel.background = element_rect(fill = '#F5F5F5', colour = 'darkgrey'),
        aspect.rtio = 1/2.5) +
  facet_wrap("Taxa", ncol = 3)
## Warning in plot_theme(plot): The `aspect.rtio` theme element is not defined in
## the element hierarchy.

ggsave("Figure_output/Fig.6_VertivalSeasonCom.pdf", width = 7, height = 10)
## Warning in plot_theme(plot): The `aspect.rtio` theme element is not defined in the element hierarchy.
## The `aspect.rtio` theme element is not defined in the element hierarchy.

Species level comparison

The robustness of the eDNA index transformation can be visualized by repeating the trends at species level, and also including other sample techniques.

# Combining datasets

tmp <- bind_rows(mutate(Net_count_2017_index_species, marker = "Count"),
          mutate(Net_COI_2017_index_species, marker = "COI_bulk"),
          mutate(EtOH_COI_2017_index_species, marker = "COI_sup"))

# Order by biomass
order <- tmp %>% 
  group_by(Taxa, marker) %>% 
  summarize(Abundance = sum(Abundance)) %>% 
  pivot_wider(names_from = "marker", values_from = "Abundance") %>% 
  arrange(-Count) %>% 
  pull(Taxa)
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.
# Arrange factors and plot. 
tmp %>%
  filter(Taxa %in% SpeciesList) %>%
  mutate(marker = factor(marker, levels = c("Count", "COI_bulk", "COI_sup"))) %>%
  mutate(Taxa = factor(Taxa, levels = order)) %>% 
  ggplot() +
  geom_line(aes(Date, RA_Index, color = marker,  linetype = marker)) +
  facet_wrap(facets = "Taxa", ncol = 4) +
  
  # Esthetics
  theme_minimal_hgrid(8) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
        strip.text = element_text(face = "italic"),
        #legend.position="none",
        panel.background = element_rect(fill = '#F5F5F5', colour = 'darkgrey'),
        aspect.ratio = 1/2.5,
        legend.position = "none") +
  scale_color_manual(values = c('#fec44f','#4daf4a',"cyan"))

ggsave("Figure_output/Supp.Fig.S5_SeasonalBulkSpecies.pdf", width = 10, height = 7)

Data ammount

The ammount of sequences per taxa can hint at the reproducibility of the seasonal trends of the zooplankton. Species with low data may have less reliable results.

  # Combine data sets
  bind_rows(
    mutate(Net_count_2017_index_genus, marker = "Microscopy"),
    mutate(Net_COI_2017_index_genus, marker = "COI"),
    mutate(Net_18S_2017_index_genus, marker = "18S")) %>% 

  filter(Taxa %in% GenusList) %>%
  select(Taxa, Date, Abundance, marker) %>%
  group_by(Taxa, marker) %>%
  summarise(Abundance = sum(Abundance)) %>% 
  pivot_wider(names_from = "marker", values_from = "Abundance") %>%
  arrange(-Microscopy) %>% 
  write_csv("Figure_output/Suppl.Tab.S4_DataAmmount.csv")
## `summarise()` has grouped output by 'Taxa'. You can override using the
## `.groups` argument.

6. Session info

These packages and versions were used in this script.

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] vegan_2.6-4       lattice_0.22-5    permute_0.9-7     ggnewscale_0.5.0 
##  [5] patchwork_1.3.0   ggpubr_0.6.0      cowplot_1.1.3     ggspatial_1.1.9  
##  [9] ggOceanMaps_2.2.0 eulerr_7.0.0      UpSetR_1.4.0      lubridate_1.9.3  
## [13] forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2      
## [17] readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      ggplot2_3.5.1    
## [21] tidyverse_2.0.0   readxl_1.4.3     
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-164       sf_1.0-15          bit64_4.0.5        tools_4.2.1       
##  [5] backports_1.4.1    bslib_0.7.0        utf8_1.2.4         R6_2.5.1          
##  [9] KernSmooth_2.23-22 DBI_1.2.1          mgcv_1.9-1         colorspace_2.1-0  
## [13] withr_3.0.0        tidyselect_1.2.0   gridExtra_2.3      bit_4.0.5         
## [17] compiler_4.2.1     textshaping_0.3.7  cli_3.6.2          labeling_0.4.3    
## [21] sass_0.4.9         scales_1.3.0       classInt_0.4-10    proxy_0.4-27      
## [25] systemfonts_1.0.5  digest_0.6.35      rmarkdown_2.26     pkgconfig_2.0.3   
## [29] htmltools_0.5.8.1  fastmap_1.1.1      highr_0.10         rlang_1.1.3       
## [33] rstudioapi_0.16.0  jquerylib_0.1.4    farver_2.1.2       generics_0.1.3    
## [37] jsonlite_1.8.8     vroom_1.6.5        car_3.1-2          magrittr_2.0.3    
## [41] Matrix_1.6-5       Rcpp_1.0.12        munsell_0.5.1      fansi_1.0.6       
## [45] abind_1.4-5        lifecycle_1.0.4    stringi_1.8.4      yaml_2.3.8        
## [49] carData_3.0-5      MASS_7.3-60.0.1    plyr_1.8.9         grid_4.2.1        
## [53] parallel_4.2.1     crayon_1.5.2       stars_0.6-4        splines_4.2.1     
## [57] hms_1.1.3          polylabelr_0.2.0   knitr_1.46         pillar_1.9.0      
## [61] ggsignif_0.6.4     glue_1.7.0         evaluate_0.23      vctrs_0.6.5       
## [65] tzdb_0.4.0         cellranger_1.1.0   polyclip_1.10-6    gtable_0.3.5      
## [69] cachem_1.0.8       xfun_0.43          broom_1.0.5        e1071_1.7-14      
## [73] rstatix_0.7.2      ragg_1.2.7         class_7.3-22       units_0.8-5       
## [77] cluster_2.1.6      timechange_0.3.0